white box model
Assessing the Potential for Catastrophic Failure in Dynamic Post-Training Quantization
Post-training quantization (PTQ) has recently emerged as an effective tool for reducing the computational complexity and memory usage of a neural network by representing its weights and activations with lower precision. While this paradigm has shown great success in lowering compute and storage costs, there is the potential for drastic performance reduction depending upon the distribution of inputs experienced in inference. When considering possible deployment in safety-critical environments, it is important to investigate the extent of potential performance reduction, and what characteristics of input distributions may give rise to this reduction. In this work, we explore the idea of extreme failure stemming from dynamic PTQ and formulate a knowledge distillation and reinforcement learning task to learn a network and bit-width policy pair such that catastrophic failure under quantization is analyzed in terms of worst case potential. Our results confirm the existence of this "detrimental" network-policy pair, with several instances demonstrating performance reductions in the range of 10-65% in accuracy, compared to their "robust" counterparts encountering a <2% decrease. From systematic experimentation and analyses, we also provide an initial exploration into points at highest vulnerability. While our results represent an initial step toward understanding failure cases introduced by PTQ, our findings ultimately emphasize the need for caution in real-world deployment scenarios. We hope this work encourages more rigorous examinations of robustness and a greater emphasis on safety considerations for future works within the broader field of deep learning.
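The input dependence described in this abstract can be illustrated with a minimal sketch. It assumes plain min/max uniform quantization, where the range is taken from the values being quantized. This is not the paper's knowledge distillation and reinforcement learning setup; the function names and sample values are hypothetical.

```python
def quantize(values, bits):
    """Quantize floats to 2**bits uniform levels over their own min/max range,
    then map them back to floats. The range comes from the data itself,
    as in dynamic schemes (illustrative assumption)."""
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    if hi == lo or levels < 1:
        return [lo] * len(values)
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]


def mean_abs_error(original, approx):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(original, approx)) / len(original)


# Hypothetical weights: the error grows as the bit-width shrinks.
weights = [-1.0, -0.37, 0.02, 0.41, 0.93, 2.5]
errors_by_bits = {
    bits: mean_abs_error(weights, quantize(weights, bits)) for bits in (8, 4, 2)
}

# Hypothetical activations: one outlier stretches the range, so the same
# "typical" values are reconstructed far less precisely at the same bit-width.
typical = [0.1, 0.2, 0.3, 0.4]
with_outlier = typical + [50.0]
err_typical = mean_abs_error(typical, quantize(typical, 4))
err_shifted = mean_abs_error(typical, quantize(with_outlier, 4)[:4])
```

In this toy setting, the second comparison shows how an unusual input distribution alone can degrade a quantized representation, even when the bit-width is unchanged.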
Interpretable Summaries of Black Box Incident Triaging with Subgroup Discovery
Remil, Youcef, Bendimerad, Anes, Plantevit, Marc, Robardet, Céline, Kaytoue, Mehdi
The need for predictive maintenance comes with an increasing number of incidents reported by monitoring systems and equipment/software users. On the front line, on-call engineers (OCEs) have to quickly assess the severity of an incident and decide which service to contact for corrective actions. To automate these decisions, several predictive models have been proposed, but the most efficient models are opaque (i.e., black box), which strongly limits their adoption. In this paper, we propose an efficient black box model based on 170K incidents reported to our company over the last 7 years and emphasize the need to automate triage when incidents are reported en masse on thousands of servers running our product, an ERP. Recent developments in eXplainable Artificial Intelligence (XAI) help in providing global explanations of the model and, most importantly, local explanations for each model prediction/outcome. Unfortunately, providing a human with an explanation for each outcome is not feasible when dealing with a large number of daily predictions. To address this problem, we propose an original data-mining method rooted in Subgroup Discovery, a pattern mining technique with the natural ability to group objects that share similar explanations of their black box predictions and to provide a description for each group. We evaluate this approach and present preliminary results that make us hopeful about effective adoption by OCEs. We believe that this approach provides a new way to address the problem of model-agnostic outcome explanation.
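To make the idea of scoring a described group more concrete, here is a minimal sketch of weighted relative accuracy (WRAcc), a quality measure commonly used in Subgroup Discovery. The abstract does not say which measure the authors use, so this choice is an assumption. The incident fields (`service`, `top_feature`) are hypothetical.

```python
def wracc(rows, condition, target):
    """Weighted relative accuracy of the subgroup selected by `condition`:
    coverage * (target rate inside the subgroup - overall target rate)."""
    n = len(rows)
    covered = [r for r in rows if condition(r)]
    if n == 0 or not covered:
        return 0.0
    p_all = sum(1 for r in rows if target(r)) / n
    p_sub = sum(1 for r in covered if target(r)) / len(covered)
    return (len(covered) / n) * (p_sub - p_all)


# Hypothetical incidents: `top_feature` stands for the feature that dominates
# the local explanation of the black box prediction for that incident.
incidents = [
    {"service": "db", "top_feature": "disk"},
    {"service": "db", "top_feature": "disk"},
    {"service": "db", "top_feature": "disk"},
    {"service": "db", "top_feature": "cpu"},
    {"service": "web", "top_feature": "disk"},
    {"service": "web", "top_feature": "cpu"},
    {"service": "web", "top_feature": "cpu"},
    {"service": "web", "top_feature": "memory"},
]

# Score the description "service = db" for the target
# "explanation dominated by disk".
score = wracc(
    incidents,
    condition=lambda r: r["service"] == "db",
    target=lambda r: r["top_feature"] == "disk",
)
```

A positive score means the description covers incidents whose explanations share the target property more often than incidents do overall. A summary of this kind is what an engineer could read in place of one explanation per prediction.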
Developing a Fidelity Evaluation Approach for Interpretable Machine Learning
Velmurugan, Mythreyi, Ouyang, Chun, Moreira, Catarina, Sindhgatta, Renuka
Explainable AI (XAI) methods are used in order to improve the interpretability of these complex "black box" models, thereby increasing transparency and enabling informed decision-making (Guidotti et al, 2018). Despite this, methods to assess the quality of explanations generated by such explainable methods are so far under-explored. In particular, functionally-grounded evaluation methods, which measure the inherent ability of explainable methods in a given situation, are often specific to a particular type of dataset or explainable method. A key measure of functionally-grounded explanation fitness is explanation fidelity, which assesses the correctness and completeness of the explanation with respect to the underlying black box predictive model (Zhou et al, 2021). Evaluations of fidelity in literature can generally be classified as one of the following: external fidelity evaluation, which assesses how well the prediction of the underlying model and the explanation agree, and internal fidelity, which assesses how well the explanation matches the decision-making processes of the underlying model (Messalas et al, 2019). While methods to evaluate external fidelity are relatively common in literature (Guidotti et al, 2019; Lakkaraju et al, 2016; Ming et al, 2019; Shankaranarayana and Runje, 2019), evaluation methods to evaluate internal fidelity using black box models are generally limited to text and image data, rather than tabular (Du et al, 2019; Fong and Vedaldi, 2017; Nguyen, 2018; Samek et al, 2017). In this paper, we propose a novel evaluation method based on a three phase approach: (1) the creation of a fully transparent, inherently interpretable white box model, and evaluation of explanations against this model; (2) the usage of the white box as a proxy to refine and improve the evaluation of explanations generated by a black box model; and (3) test the fidelity of explanations for a black box model using the refined method from the second phase.
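The two notions of fidelity described above can be sketched in a few lines. This is an illustration, not the authors' evaluation method. External fidelity is computed as a simple agreement rate. For the white box case, reading "correctness" as precision and "completeness" as recall over feature sets is an assumption made here. All names and values are hypothetical.

```python
def external_fidelity(model_preds, explanation_preds):
    """Fraction of instances where the explanation's prediction
    agrees with the underlying model's prediction."""
    if len(model_preds) != len(explanation_preds):
        raise ValueError("prediction lists must have the same length")
    if not model_preds:
        return 0.0
    agree = sum(1 for m, e in zip(model_preds, explanation_preds) if m == e)
    return agree / len(model_preds)


def feature_overlap(explained_features, true_features):
    """Compare the features named by an explanation with the features a
    transparent white box model is known to use.
    Returns (precision, recall): precision as a proxy for correctness,
    recall as a proxy for completeness (an assumption of this sketch)."""
    explained, true = set(explained_features), set(true_features)
    hits = explained & true
    precision = len(hits) / len(explained) if explained else 0.0
    recall = len(hits) / len(true) if true else 0.0
    return precision, recall


# Hypothetical predictions from a model and from a surrogate explanation.
agreement = external_fidelity([1, 0, 1, 1], [1, 0, 0, 1])

# Hypothetical tabular features: what the explanation reports versus what
# the white box model actually relies on.
precision, recall = feature_overlap(
    ["age", "income", "zip"],
    ["age", "income", "tenure", "balance"],
)
```

The second function only makes sense when the model's true decision features are known, which is why a white box model is a natural starting point for this kind of check.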
The main contributions of this work are as follows: 1.
Explanations of model predictions with live and breakDown packages
Staniak, Mateusz, Biecek, Przemyslaw
Predictive modelling is a very exciting field with many different applications. Lots of algorithms have been developed in this area. According to many Kaggle competitions (Fogg, 2016), winning solutions are often obtained with elastic tools like random forest, gradient boosting or neural networks. These algorithms have many strengths but also share a major weakness, which is the lack of interpretability of a model structure. A single random forest, an xgboost model or a neural network may be parametrized with thousands of parameters which makes these models hard to understand.